Upgrade to latest version of azure-search-documents and agentic retrieval API #2723

pamelafox · 2025-09-09T19:31:16Z

Purpose

This pull request introduces a major refactor to the agentic retrieval integration, updating the codebase to use the latest Azure AI Search agentic retrieval API.

The new API can optionally include the reference source data (all the fields from each chunk), so we no longer need explicit hydration.
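
As a rough illustration of what dropping explicit hydration means, the sketch below builds result documents straight from the source data carried on each reference, instead of issuing a second lookup by document key. The reference shape (plain dicts with `doc_key` and `source_data`), the `Doc` class, and the helper name `documents_from_references` are assumptions made for this example, not the code in approach.py.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for the app's document type."""

    id: str
    content: Optional[str] = None
    sourcepage: Optional[str] = None


def documents_from_references(references: list[dict[str, Any]]) -> list[Doc]:
    """Build documents from the source data attached to each reference."""
    docs = []
    for ref in references:
        # source_data may be absent when the agent was not asked to include it
        source = ref.get("source_data") or {}
        docs.append(
            Doc(
                id=ref["doc_key"],
                content=source.get("content"),
                sourcepage=source.get("sourcepage"),
            )
        )
    return docs


# Made-up sample references for illustration
refs = [
    {"doc_key": "file-a-page-1", "source_data": {"content": "Plan overview", "sourcepage": "a.pdf#page=1"}},
    {"doc_key": "file-b-page-3", "source_data": None},
]
docs = documents_from_references(refs)
```

With source data present, every field needed for a citation is already on the reference, so no follow-up search by key is required.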

The new API does not support passing in max subqueries at query time, so I've removed that as a Developer Setting. That can only be customized in the search manager, at agent creation time.

This is the changelog for the package upgrade:
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md

And these are the API specs:
https://github.com/Azure/azure-rest-api-specs/blob/a71c94fb88b21af5c99442fd138b2570fc29622b/specification/search/data-plane/Azure.Search/preview/2025-08-01-preview/searchservice.json#L2701

Agentic retrieval API and data model updates:

Replaced legacy agentic retrieval classes and parameters (such as KnowledgeAgentAzureSearchDocReference, KnowledgeAgentIndexParams, and hydration logic) with new types (KnowledgeAgentSearchIndexReference, SearchIndexKnowledgeSourceParams, etc.) and simplified reference handling in approach.py. Removed unused hydration and reranker-related code. [1] [2] [3] [4]
Updated agent creation in searchmanager.py to use SearchIndexKnowledgeSource and KnowledgeSourceReference instead of KnowledgeAgentTargetIndex, and now explicitly selects source fields and reference options. [1] [2]

Parameter and code cleanup:

Removed the hydrate_references, minimum_reranker_score, and max_docs_for_reranker parameters from constructors and method calls in approach.py, chatreadretrieveread.py, and retrievethenread.py. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]

Dependency updates:

Upgraded the azure-search-documents package to version 11.7.0b1 in both requirements.in and requirements.txt to support the new agent API features. [1] [2]

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[X] Yes - It may, if they are using agentic. Hopefully it won't because I gave the agent a new name (suffix of '-upgrade'), so it won't try to use the old agent with incompatible configuration.
[ ] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial, which includes deployment, settings, and usage instructions. If text or screenshots need to change in the tutorial, check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[X] No

Type of change

[ ] Bugfix
[X] Feature
[ ] Code style update (formatting, local variables)
[X] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

The current tests all pass (python -m pytest).
I added tests that prove my fix is effective or that my feature works
I ran python -m pytest --cov to verify 100% coverage of added lines
I ran python -m mypy to check for type errors
I either used the pre-commit hooks or ran ruff and black manually on my code.

taylorn-ai · 2025-09-10T05:34:00Z

@pamelafox - where are you at with this? Do you need a hand with anything?

pamelafox · 2025-09-10T05:38:41Z

@taylorn-ai Tests just passed, and I just verified this is working with the multimodal feature, so this is ready for review! I'd love if you want to review the code and/or check out the branch to see if it works for you.

Copilot

Pull Request Overview

This pull request upgrades the Azure Search Documents SDK to version 11.7.0b1 and refactors the agentic retrieval integration to use the latest API. The new API includes reference source data directly, eliminating the need for explicit hydration, and removes runtime customization of max subqueries.

Key changes include:

Upgraded azure-search-documents to 11.7.0b1 for latest agentic retrieval API support
Replaced legacy agentic retrieval classes with new API types and simplified reference handling
Removed max_subqueries parameter and hydration-related code as these are no longer supported

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.

File	Description
app/backend/requirements.in	Updated azure-search-documents version to 11.7.0b1
app/backend/requirements.txt	Updated dependencies with new azure-search-documents version
app/backend/approaches/approach.py	Replaced legacy agentic retrieval types with new API classes and removed hydration logic
app/backend/approaches/chatreadretrieveread.py	Removed hydrate_references parameter and max_docs_for_reranker calculations
app/backend/approaches/retrievethenread.py	Removed hydrate_references parameter and max_docs_for_reranker calculations
app/backend/prepdocslib/searchmanager.py	Updated agent creation to use SearchIndexKnowledgeSource and new reference types
app/backend/app.py	Removed ENABLE_AGENTIC_RETRIEVAL_SOURCE_DATA environment variable usage
app/frontend/src/api/models.ts	Removed max_subqueries from ChatAppRequestOverrides type
app/frontend/src/pages/chat/Chat.tsx	Removed max subqueries UI setting and state management
app/frontend/src/pages/ask/Ask.tsx	Removed max subqueries UI setting and state management
app/frontend/src/components/Settings/Settings.tsx	Removed max subqueries input field from developer settings
app/frontend/src/components/AnalysisPanel/AgentPlan.tsx	Updated activity record type names and property access
app/frontend/src/locales/*/translation.json	Removed max subqueries translations from all language files
infra/main.bicep	Removed enableAgenticRetrievalSourceData parameter and updated agent name suffix
infra/main.parameters.json	Removed ENABLE_AGENTIC_RETRIEVAL_SOURCE_DATA parameter mapping
docs/*.md	Updated documentation to remove compatibility warnings and max subqueries references
evals/*.json	Removed max_subqueries from evaluation configuration files
tests/	Updated test mocks and removed hydration-related test cases

Comments suppressed due to low confidence (1)

app/frontend/src/components/AnalysisPanel/AgentPlan.tsx:1

The AzureSearchQueryStep type includes a query_time field that doesn't appear to be used anywhere in the component. Consider removing this unused field or documenting why it's included if it's intended for future use.

import React from "react";

app/backend/approaches/approach.py

taylorn-ai · 2025-09-10T06:00:38Z

@pamelafox - looks good to me, nice work :)

I did notice, however, that many of the translation files are missing some keys. I noticed this only because you removed maxSubqueryCount from some lang files but not all, so I ran i18n-check.

file	key
src/locales/da/translation.json	labels.resultsMergeStrategy
src/locales/da/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/da/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/da/translation.json	helpTexts.resultsMergeStrategy
src/locales/da/translation.json	helpTexts.llmTextInputs
src/locales/da/translation.json	helpTexts.llmImageInputs
src/locales/es/translation.json	labels.resultsMergeStrategy
src/locales/es/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/es/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/es/translation.json	helpTexts.resultsMergeStrategy
src/locales/es/translation.json	helpTexts.llmTextInputs
src/locales/es/translation.json	helpTexts.llmImageInputs
src/locales/fr/translation.json	labels.resultsMergeStrategy
src/locales/fr/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/fr/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/fr/translation.json	helpTexts.resultsMergeStrategy
src/locales/fr/translation.json	helpTexts.llmTextInputs
src/locales/fr/translation.json	helpTexts.llmImageInputs
src/locales/it/translation.json	labels.resultsMergeStrategy
src/locales/it/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/it/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/it/translation.json	helpTexts.resultsMergeStrategy
src/locales/it/translation.json	helpTexts.llmTextInputs
src/locales/it/translation.json	helpTexts.llmImageInputs
src/locales/ja/translation.json	helpTexts.llmTextInputs
src/locales/ja/translation.json	helpTexts.llmImageInputs
src/locales/nl/translation.json	labels.resultsMergeStrategy
src/locales/nl/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/nl/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/nl/translation.json	helpTexts.resultsMergeStrategy
src/locales/nl/translation.json	helpTexts.llmTextInputs
src/locales/nl/translation.json	helpTexts.llmImageInputs
src/locales/ptBR/translation.json	labels.resultsMergeStrategy
src/locales/ptBR/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/ptBR/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/ptBR/translation.json	helpTexts.resultsMergeStrategy
src/locales/ptBR/translation.json	helpTexts.llmTextInputs
src/locales/ptBR/translation.json	helpTexts.llmImageInputs
src/locales/tr/translation.json	labels.resultsMergeStrategy
src/locales/tr/translation.json	labels.resultsMergeStrategyOptions.interleaved
src/locales/tr/translation.json	labels.resultsMergeStrategyOptions.descending
src/locales/tr/translation.json	helpTexts.resultsMergeStrategy
src/locales/tr/translation.json	helpTexts.llmTextInputs
src/locales/tr/translation.json	helpTexts.llmImageInputs
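
The kind of check that produced the table above can be approximated in a few lines. This is a minimal sketch, not how i18n-check itself works: it flattens nested translation objects into dotted keys and reports the keys present in a reference locale but absent from another. The function names and the sample data are made up for illustration.

```python
from typing import Any


def flatten_keys(obj: dict[str, Any], prefix: str = "") -> set[str]:
    """Return the dotted paths of all leaf values in a nested dict."""
    keys: set[str] = set()
    for name, value in obj.items():
        dotted = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            keys |= flatten_keys(value, dotted)
        else:
            keys.add(dotted)
    return keys


def missing_keys(reference: dict[str, Any], other: dict[str, Any]) -> list[str]:
    """Keys that exist in the reference locale but not in the other one."""
    return sorted(flatten_keys(reference) - flatten_keys(other))


# Made-up sample data standing in for two translation.json files
en = {
    "labels": {
        "resultsMergeStrategy": "Results merge strategy",
        "resultsMergeStrategyOptions": {"interleaved": "Interleaved", "descending": "Descending"},
    },
    "helpTexts": {"llmTextInputs": "..."},
}
ja = {
    "labels": {
        "resultsMergeStrategy": "...",
        "resultsMergeStrategyOptions": {"interleaved": "...", "descending": "..."},
    },
    "helpTexts": {},
}
missing = missing_keys(en, ja)
```

In a real run, the dicts would be loaded with `json.load` from each `src/locales/<lang>/translation.json` file.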

pamelafox · 2025-09-10T06:04:46Z

@taylorn-ai Oo thanks! I did not know about i18n-check, that sounds like a new CI check that we need.
Let me generate those translations and see if I can get some human reviewers to check them.

taylorn-ai · 2025-09-10T06:08:06Z

The one issue I have with it is that it uses i18next-parser, which in turn uses some outdated dependencies (see here), but for dev that's not really an issue.

Also, I used i18n-auto-translation to translate, and you can even use Azure AI Translation with it, just thought I would mention that too :)

pamelafox · 2025-09-10T06:16:52Z

@taylorn-ai Yep, you're right, it does have a bunch of dependency warnings. I've added it to the CI using npx so that it doesn't have to go in the package.json at all. I generated the translations with GPT-5 in Copilot, which usually does a decent job, but I'll ping some human i18n reviewers too.

taylorn-ai · 2025-09-10T06:21:44Z

Actually, something I just noticed, instead of hard coding the field names, maybe they should be fetched dynamically?

e.g.

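from azure.identity import DefaultAzureCredential
from azure.search.documents.indexes import SearchIndexClient

# endpoint and index_name are assumed to be defined elsewhere
client = SearchIndexClient(endpoint=endpoint, credential=DefaultAzureCredential())
index = client.get_index(index_name)
field_names = [f.name for f in index.fields if f.searchable]
...
source_data_select = ",".join(field_names)
...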

Or, perhaps better...

from dataclasses import fields
from approaches.approach import Document
# Names to leave out of the selected fields
skip_fields = {"score", "reranker_score", "search_agent_query"}
search_fields = [f.name for f in fields(Document) if f.name not in skip_fields]
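
To make the second suggestion concrete, here is a self-contained sketch. `Document` below is a hypothetical stand-in with only a handful of fields, not the real class from approaches/approach.py, and the skip list is the one proposed above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in; the real class has more fields."""

    id: Optional[str] = None
    content: Optional[str] = None
    sourcepage: Optional[str] = None
    score: Optional[float] = None
    reranker_score: Optional[float] = None
    search_agent_query: Optional[str] = None


# Names to leave out of the selected fields
SKIP_FIELDS = {"score", "reranker_score", "search_agent_query"}

# fields() yields dataclass fields in definition order
search_fields = [f.name for f in fields(Document) if f.name not in SKIP_FIELDS]
source_data_select = ",".join(search_fields)
```

One trade-off of this approach is that the select list follows the dataclass rather than the index schema, so it only works if every remaining dataclass field also exists in the index.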

app/frontend/src/locales/ja/translation.json

taylorn-ai · 2025-09-10T06:50:50Z

@pamelafox - sorry for the spam, but I did actually notice an issue, not specifically related to this PR, but it made me remember.

It seems that at some point @search.reranker_score was renamed to @search.rerankerScore. All your tests pass because you use reranker_score as the field name, but I would imagine that your Document class is likely not returning anything, as it uses reranker_score=document.get("@search.reranker_score").

app/frontend/src/locales/fr/translation.json

app/frontend/src/locales/es/translation.json

Co-authored-by: Gwen Peña-Siguenza <[email protected]> Co-authored-by: Wassim Chegham <[email protected]> Co-authored-by: Anthony Shaw <[email protected]>

pamelafox · 2025-09-10T16:13:51Z

@taylorn-ai Hm, I just printed out the values in approach.py from AI Search (non-agentic), and it shows the score for @search.reranker_score, but a null value from @search.rerankerScore
Where do you see that it got renamed?

Co-authored-by: Copilot <[email protected]>

pamelafox · 2025-09-10T18:37:30Z

@taylorn-ai The PR is merged, but do follow up on rerankerScore if you still see an issue (here or in a new issue).

taylorn-ai · 2025-09-10T22:25:30Z

The documentation says that the field is called @search.rerankerScore, and when you use the search index browser in Azure it returns it the same way:

"@search.score": 36.73281,
"@search.rerankerScore": 2.6607306003570557,

However, after looking a bit further, it seems that only the SDK returns it as reranker_score, so you are correct, my bad - just very confusing!

pamelafox · 2025-09-11T00:23:49Z

@taylorn-ai I asked @mattgotteiner and he says that's due to the Python SDK explicitly snake_casing the API return values.
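
The naming mismatch discussed in this thread is easy to reproduce in isolation. The sketch below only illustrates camelCase to snake_case conversion and a lookup that tolerates both spellings; it is not the SDK's actual code path, and `to_snake_case` and `get_reranker_score` are hypothetical helpers.

```python
import re
from typing import Any, Optional


def to_snake_case(name: str) -> str:
    """Insert an underscore before each capital that follows a lowercase letter or digit."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", lambda m: "_" + m.group(1).lower(), name)


def get_reranker_score(document: dict[str, Any]) -> Optional[float]:
    """Read the reranker score under either the snake_case or the camelCase key."""
    for key in ("@search.reranker_score", "@search.rerankerScore"):
        if document.get(key) is not None:
            return document[key]
    return None


# Payload shaped like the search index browser output quoted above
rest_payload = {"@search.score": 36.73281, "@search.rerankerScore": 2.6607306003570557}
sdk_style = {to_snake_case(k): v for k, v in rest_payload.items()}
```

Here `sdk_style` ends up keyed by `@search.reranker_score`, which matches what the existing `document.get("@search.reranker_score")` call expects.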

pamelafox added 6 commits September 8, 2025 17:40

Move to new azure-search-documents package

10a2c49

Remove max subqueries option, fix sort order

ccc28e5

Revert whitespace changes

a547b1e

Remove whitespace changes

78187e4

Merge from main

53df4df

Revert whitespace change

adf80e9

pamelafox changed the title ~~Upgrade to latest version of azure-search-documents and agentic retrieval API~~ [WIP] Upgrade to latest version of azure-search-documents and agentic retrieval API Sep 9, 2025

pamelafox marked this pull request as draft September 9, 2025 19:31

pamelafox added 2 commits September 9, 2025 13:55

Update tests

3de71da

Update docs and make mypy happy

9bbb423

pamelafox mentioned this pull request Sep 10, 2025

feat: Add extra search index fields to Knowledge Agent response #2696

Merged

5 tasks

Update tests and docs

e28e3a4

pamelafox changed the title ~~[WIP] Upgrade to latest version of azure-search-documents and agentic retrieval API~~ Upgrade to latest version of azure-search-documents and agentic retrieval API Sep 10, 2025

pamelafox marked this pull request as ready for review September 10, 2025 05:39

pamelafox requested review from Copilot and mattgotteiner September 10, 2025 05:39

This comment was marked as outdated.

Remove captions as they shouldnt be in agentic retrieval mocks

ca509ac

pamelafox requested a review from Copilot September 10, 2025 05:52

Copilot AI reviewed Sep 10, 2025

app/backend/approaches/approach.py (outdated)

Fix missing translations

54fa985

tonybaloney reviewed Sep 10, 2025

app/frontend/src/locales/ja/translation.json (outdated)

tonybaloney reviewed Sep 10, 2025

app/frontend/src/locales/ja/translation.json (outdated)

manekinekko reviewed Sep 10, 2025

app/frontend/src/locales/fr/translation.json (outdated)

app/frontend/src/locales/fr/translation.json (outdated)

madebygps suggested changes Sep 10, 2025

Apply i18n suggestions from code review

c52fd1a

Co-authored-by: Gwen Peña-Siguenza <[email protected]> Co-authored-by: Wassim Chegham <[email protected]> Co-authored-by: Anthony Shaw <[email protected]>

mattgotteiner approved these changes Sep 10, 2025

Update app/backend/approaches/approach.py

5ee2db7

Co-authored-by: Copilot <[email protected]>

HeidiSteen approved these changes Sep 10, 2025

HeidiSteen merged commit 305ab5b into Azure-Samples:main Sep 10, 2025
29 checks passed
